refactor(optimizer): divide logical optimizer into one for batch and one for streaming. #8192

chenzl25 · 2023-02-27T05:44:09Z

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Move logical optimization from PlanRoot to LogicalOptimizer.
Use static rules instead of creating new ones for queries every time.
Separate logical optimization into one for batch and one for streaming.

Checklist For Contributors

I have written necessary rustdoc comments
I have added necessary unit tests and integration tests
I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
I have demonstrated that backward compatibility is not broken by breaking changes and created issues to track deprecated features to be removed in the future. (Please refer to the issue)
All checks passed in ./risedev check (or alias, ./risedev c)

Checklist For Reviewers

I have requested macro/micro-benchmarks as this PR can affect performance substantially, and the results are shown.

Documentation

My PR DOES NOT contain user-facing changes.

Click here for Documentation

Types of user-facing changes

Please keep the types that apply to your changes, and remove the others.

Installation and deployment
Connector (sources & sinks)
SQL commands, functions, and operators
RisingWave cluster configuration changes
Other (please specify in the release note below)

Release note

#8137

codecov · 2023-02-27T06:26:23Z

Codecov Report

Merging #8192 (bfb4adf) into main (de37916) will decrease coverage by 0.02%.
The diff coverage is 90.00%.

@@            Coverage Diff             @@
##             main    #8192      +/-   ##
==========================================
- Coverage   71.69%   71.68%   -0.02%     
==========================================
  Files        1131     1132       +1     
  Lines      182332   182320      -12     
==========================================
- Hits       130723   130694      -29     
- Misses      51609    51626      +17

Flag	Coverage Δ
rust	`71.68% <90.00%> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
src/frontend/src/optimizer/logical_optimization.rs	`89.56% <89.56%> (ø)`
src/frontend/planner_test/src/lib.rs	`74.94% <100.00%> (ø)`
src/frontend/src/optimizer/heuristic_optimizer.rs	`100.00% <100.00%> (ø)`
src/frontend/src/optimizer/mod.rs	`94.92% <100.00%> (-1.04%)`	⬇️
...rontend/src/optimizer/rule/translate_apply_rule.rs	`97.45% <100.00%> (+0.63%)`	⬆️
src/meta/src/hummock/mock_hummock_meta_client.rs	`64.94% <0.00%> (-0.52%)`	⬇️
src/batch/src/task/task_execution.rs	`51.62% <0.00%> (-0.51%)`	⬇️
src/common/src/field_generator/mod.rs	`75.00% <0.00%> (-0.50%)`	⬇️
...frontend/src/scheduler/hummock_snapshot_manager.rs	`60.00% <0.00%> (-0.48%)`	⬇️
src/storage/src/hummock/compactor/mod.rs	`80.19% <0.00%> (-0.39%)`	⬇️
... and 5 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

fuyufjh

PATL @st1page

src/frontend/src/optimizer/logical_optimization.rs

fuyufjh · 2023-02-27T10:25:18Z

src/frontend/src/optimizer/logical_optimization.rs

+        Ok(plan)
+    }
+
+    pub fn gen_optimized_logical_plan_for_batch(mut plan: PlanRef) -> Result<PlanRef> {


Hmmm. So it comes to another problem - lots of duplication code between the for_batch and for_stream. I am not sure whether it's better than before.

Somewhat agree. I think it's easier to understand in a single block and branching on batch or stream. It makes it clearer which optimizations are shared, and which apply respectively only to batch or stream.

But this assumes that the general ordering of optimization passes are shared (so far, it's been true, but I can't tell if this PR also changes that)

Somewhat agree. I think it's easier to understand in a single block and branching on batch or stream. It makes it clearer which optimizations are shared, and which apply respectively only to batch or stream.

But this assumes that the general ordering of optimization passes are shared (so far, it's been true, but I can't tell if this PR also changes that)

In the beginning, it sounds good to only have a single block and branching on batch or stream, however, It contains more and more branches gradually. At least for now, I can give about 5 optimizations that needed to be applied to the batch and stream separately. In addition, I hope this separation can help developers to think of whether their optimization can only be applied to the batch or the streaming, or both.

src/frontend/Cargo.toml

src/frontend/planner_test/tests/testdata/predicate_pushdown.yaml

src/frontend/planner_test/tests/testdata/agg.yaml

st1page

LGTM, We used to think the optimization for streaming and batch have a common phase and it save our test and coding for a while. But currently, we find that the difference between the two optimizations could be flexible. And after this pr change, we can plug in the different optimization for the two execution more flexibly.

Hmmm. So it comes to another problem - lots of duplication code between the for_batch and for_stream. I am not sure whether it's better than before.

I think to reuse the duplication code, we should extract the common and meaningful functions such as fn predicate_push_down(mut plan) ->Result<PlanRef>(in fact this piece is duplicated even in one specific execution's optimizer) or fn simple_push_down. The emphasis of the "meaningful" here is because we do not think the batch and streaming shared a common optimization logical in nature.

In summary, I think the changes are just like that we use the Combination to reuse the code instead of Inheritance here.

chenzl25 added 2 commits February 27, 2023 13:30

refactor optimizer

f4d85ef

resolve conflict

9122238

chenzl25 requested review from jon-chuang and st1page February 27, 2023 05:44

github-actions bot added the type/refactor label Feb 27, 2023

chenzl25 requested a review from fuyufjh February 27, 2023 05:47

fix

4228495

fuyufjh reviewed Feb 27, 2023

View reviewed changes

chenzl25 added 2 commits February 27, 2023 21:07

resolve comment

3551190

resolve conflict

7bcf5c2

jon-chuang reviewed Feb 27, 2023

View reviewed changes

src/frontend/planner_test/tests/testdata/predicate_pushdown.yaml Outdated Show resolved Hide resolved

jon-chuang reviewed Feb 27, 2023

View reviewed changes

src/frontend/planner_test/tests/testdata/agg.yaml Outdated Show resolved Hide resolved

st1page approved these changes Feb 27, 2023

View reviewed changes

chenzl25 added 3 commits February 28, 2023 10:58

change optimized_logical_plan back to the batch logical optimization

36ede40

fmt

3b45a2a

fix planner test

bfb4adf

chenzl25 added this pull request to the merge queue Feb 28, 2023

Merged via the queue into main with commit ac39f03 Feb 28, 2023

chenzl25 deleted the dylan/divide_logical_optimizer_into_two branch February 28, 2023 05:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(optimizer): divide logical optimizer into one for batch and one for streaming. #8192

refactor(optimizer): divide logical optimizer into one for batch and one for streaming. #8192

chenzl25 commented Feb 27, 2023 •

edited

Loading

codecov bot commented Feb 27, 2023 •

edited

Loading

fuyufjh left a comment

fuyufjh Feb 27, 2023

jon-chuang Feb 27, 2023 •

edited

Loading

chenzl25 Feb 28, 2023

st1page left a comment

refactor(optimizer): divide logical optimizer into one for batch and one for streaming. #8192

refactor(optimizer): divide logical optimizer into one for batch and one for streaming. #8192

Conversation

chenzl25 commented Feb 27, 2023 • edited Loading

What's changed and what's your intention?

Checklist For Contributors

Checklist For Reviewers

Documentation

Types of user-facing changes

Release note

codecov bot commented Feb 27, 2023 • edited Loading

Codecov Report

fuyufjh left a comment

Choose a reason for hiding this comment

fuyufjh Feb 27, 2023

Choose a reason for hiding this comment

jon-chuang Feb 27, 2023 • edited Loading

Choose a reason for hiding this comment

chenzl25 Feb 28, 2023

Choose a reason for hiding this comment

st1page left a comment

Choose a reason for hiding this comment

chenzl25 commented Feb 27, 2023 •

edited

Loading

codecov bot commented Feb 27, 2023 •

edited

Loading

jon-chuang Feb 27, 2023 •

edited

Loading